Finding the Number of Clusters using Visual Validation VAT Algorithm

Authors

  • G. Komarasamy
  • Amitabh Wahi
Abstract

Clustering is the process of grouping a set of data so that objects in the same group (cluster) are more similar to each other than to objects in other groups. K-Means is a widely used clustering algorithm, but its main disadvantage is that the value of k must be selected by the user. To overcome this drawback, visual methods such as the VAT algorithm are generally used for cluster analysis and also to obtain the value of k prior to clustering. In many cases, however, the estimated result does not match the true (but unknown) value. The Spectral VAT (Spec-VAT) algorithm was then implemented; it is more efficient than the VAT algorithm for complex data sets. Spec-VAT-based algorithms such as A Spec-VAT, P Spec-VAT and E Spec-VAT are also used to find the number of clusters efficiently, but the range of k is given to these spectral VAT algorithms either directly or indirectly. In this paper we propose a direct visual validation method and a divergence matrix. In the proposed work, neither the value of k nor its range is specified by the user, directly or indirectly. Instead of a k value, we propose a new method that compares objects and, from the result, chooses the object that is closer than the others. Experimental results show that the proposed VVAT (Visual Validation VAT) algorithm performs much better than the other algorithms.

Keywords: VAT algorithm, visual validation, divergence matrix, VVAT algorithm
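For orientation, the baseline VAT reordering that the abstract builds on can be sketched as follows. This is a minimal illustration of the standard VAT ordering (a Prim-like nearest-neighbour sweep over a dissimilarity matrix), not the VVAT method or divergence matrix proposed in the paper, whose details the abstract does not give. The function names `pairwise_distances`, `vat_order` and `vat` are hypothetical, and Euclidean distance is an assumption.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean dissimilarity matrix for the rows of X (assumed metric)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def vat_order(D):
    """Return the VAT ordering of the objects described by the square matrix D."""
    n = D.shape[0]
    # Start from one endpoint of the largest dissimilarity in D.
    i, _ = np.unravel_index(np.argmax(D), D.shape)
    order = [int(i)]
    remaining = np.ones(n, dtype=bool)
    remaining[i] = False
    # min_dist[j]: smallest dissimilarity from j to any already-ordered object.
    min_dist = D[i].astype(float).copy()
    for _ in range(n - 1):
        # Pick the unordered object closest to the ordered set.
        masked = np.where(remaining, min_dist, np.inf)
        j = int(np.argmin(masked))
        order.append(j)
        remaining[j] = False
        min_dist = np.minimum(min_dist, D[j])
    return np.array(order)

def vat(D):
    """Reorder D so that compact clusters appear as dark diagonal blocks."""
    order = vat_order(D)
    return D[np.ix_(order, order)], order
```

Displaying the reordered matrix as a grayscale image (for example with matplotlib's `imshow`) and counting the dark blocks along the diagonal gives the visual estimate of k that the abstract refers to.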


Similar resources

An Efficient Visual Analysis Method for Cluster Tendency Evaluation, Data Partitioning and Internal Cluster Validation

Visual methods have been extensively studied and performed in cluster data analysis. Given a pairwise dissimilarity matrix D of a set of n objects, visual methods such as Enhanced-Visual Assessment Tendency (E-VAT) algorithm generally represent D as an n × n image I(D) where the objects are reordered to expose the hidden cluster structure as dark blocks along the diagonal of the image. A major ...


A New Implementation of the co-VAT Algorithm for Visual Assessment of Clusters in Rectangular Relational Data

This paper presents a new implementation of the co-VAT algorithm. We assume we have an m × n matrix D, where the elements of D are pair-wise dissimilarities between m row objects O_r and n column objects O_c. The union of these disjoint sets is the set of (N = m + n) objects O. Clustering tendency assessment is the process by which a data set is analyzed to determine the number(s) of clusters present. In 2007,...


A Comparative study of Clustering in Unlabelled Datasets Using Extended Dark Block Extraction and Extended Cluster Count Extraction

One of the major problems in cluster analysis is the determination of the number of clusters in unlabeled data prior to clustering. In this paper, we implement a new method for determining the number of clusters called Extended Dark Block Extraction (EDBE), which is based on an existing algorithm for Visual Assessment of Cluster Tendency (VAT) of a data set. Its basic steps include 1) Generatin...


A New Approach in Strategy Formulation using Clustering Algorithm: An Instance in a Service Company

The ever more severe and dynamic competitive environment has increased the complexity of strategic decision making in giant organizations. Strategy formulation is one of the basic processes in achieving long-range goals. With ordinary methods, considering all factors and their significance in accomplishing individual goals is almost impossible. Here, a new approach based on clustering method is pro...


Parallel Visual Assessment of Cluster Tendency on GPU

Determining the number of clusters in a data set is a critical issue in cluster analysis. The Visual Assessment of (cluster) Tendency (VAT) algorithm is an effective tool for investigating cluster tendency, which produces an intuitive image of matrix as the representation of complex data sets. However, VAT can be computationally expensive for large data sets due to its O(N^2) time complexity....




Publication date: 2013